A Comparison between Supervised Learning Algorithms for Word Sense Disambiguation
Abstract
This paper describes a set of comparative experiments, including cross-corpus evaluation, between five alternative algorithms for supervised Word Sense Disambiguation (WSD), namely Naive Bayes, Exemplar-based learning, SNoW, Decision Lists, and Boosting. Two main conclusions can be drawn: 1) The LazyBoosting algorithm outperforms the other four state-of-the-art algorithms in terms of accuracy and ability to tune to new domains; 2) The domain dependence of WSD systems seems very strong and suggests that some kind of adaptation or tuning is required for cross-corpus application.
Related articles
Naive Bayes and Exemplar-based Approaches to Word Sense Disambiguation Revisited
This paper describes an experimental comparison between two standard supervised learning methods, namely Naive Bayes and Exemplar-based classification, on the Word Sense Disambiguation (WSD) problem. The aim of the work is twofold. Firstly, it attempts to help clarify some confusing information in the related literature about the comparison between the two methods. In doing so, ...
Self-training and co-training in biomedical word sense disambiguation
Word sense disambiguation (WSD) is an intermediate task within information retrieval and information extraction, attempting to select the proper sense of ambiguous words. Due to the scarcity of training data, semi-supervised learning, which profits from seed annotated examples and a large set of unlabeled data, is worth researching. We present preliminary results of two semi-supervised learnin...
Investigating Problems of Semi-supervised Learning for Word Sense Disambiguation
Word Sense Disambiguation (WSD) is the problem of determining the right sense of a polysemous word in a given context. In this paper, we investigate the use of unlabeled data for WSD within the framework of semi-supervised learning, in which the original labeled dataset is iteratively extended by exploiting unlabeled data. This paper addresses two problems occurring in this approach: deter...
An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation
In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines (SVM), Naive Bay...
An Optimized Combinatorial Approach of Learning Algorithm for Word Sense Disambiguation
Word sense disambiguation is the process of finding the best sense of an ambiguous word among its existing senses in order to remove the ambiguity. This thesis work is an attempt to optimize the word sense disambiguation method. Supervised machine learning algorithms have most commonly been used to solve this problem and improve performance. Some attempts were made to use unsupervised machine learning algorithms al...